Fill CUDA opset gap for ReduceMax and ReduceMin (18 → 20) #27755

Merged
tianleiwu merged 5 commits into main from copilot/update-onnx-reduce-operators
May 5, 2026
Copilot AI commented Mar 19, 2026

Description

Extends CUDA ReduceMax and ReduceMin kernel registrations from opset 18 to opset 20.

  • reduction_ops.cc: Added REGISTER_KERNEL_VERSIONED_RANGE_AXES_INPUT_TYPED macro for versioned ranges requiring InputMemoryType(OrtMemTypeCPUInput, 1). Split both operators from 2-way (1–17, 18+) to 3-way (1–17, 18–19, 20+).
  • cuda_execution_provider.cc: Capped opset 18 forward declarations and BuildKernelCreateInfo entries to versioned 18–19. Added opset 20 non-versioned entries for both operators.
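
To illustrate why the split from 2-way to 3-way matters, here is a minimal, self-contained sketch of version-range matching. It is not the ONNX Runtime registry: `KernelRange`, `Matches`, and `HasKernel` are hypothetical names, and the matching rule (an open-ended registration matches only the operator version it starts at, while a closed range matches any version inside it) is a simplifying assumption.

```cpp
#include <array>
#include <climits>
#include <cstddef>

// Hypothetical, simplified model of one kernel registration: the kernel is
// registered for operator versions [start, end]; end == INT_MAX means open-ended.
struct KernelRange {
  int start;
  int end;
};

// Assumed matching rule (a simplification, not the real registry code):
// an open-ended registration matches only the operator version it starts at,
// while a closed range matches any operator version inside the range.
constexpr bool Matches(KernelRange k, int op_since_version) {
  if (k.end == INT_MAX) return k.start == op_since_version;
  return k.start <= op_since_version && op_since_version <= k.end;
}

// Returns true if any registration in the table matches the operator version.
template <std::size_t N>
constexpr bool HasKernel(const std::array<KernelRange, N>& table,
                         int op_since_version) {
  for (std::size_t i = 0; i < N; ++i) {
    if (Matches(table[i], op_since_version)) return true;
  }
  return false;
}

// Before the PR: 2-way split (1-17, 18+).
constexpr std::array<KernelRange, 2> kBefore{{{1, 17}, {18, INT_MAX}}};
// After the PR: 3-way split (1-17, 18-19, 20+).
constexpr std::array<KernelRange, 3> kAfter{{{1, 17}, {18, 19}, {20, INT_MAX}}};
```

Under that assumed rule, `HasKernel(kBefore, 20)` is false while `HasKernel(kAfter, 20)` is true, and version 18 resolves in both tables.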

Type coverage is unchanged: ReduceMax supports float, double, MLFloat16, int32_t, and int64_t; ReduceMin supports the same types plus int8_t and uint8_t.

Motivation and Context

The CUDA registrations for ReduceMax and ReduceMin stopped at opset 18, while the latest ONNX definitions of both operators are at opset 20. Models exported with opset 19–20 could fail to find a matching CUDA kernel for these ops.

Follows the same pattern used in #27735 (TopK) and other opset gap PRs tracked in #27729.

… 18 to opset 20

Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>

@github-actions github-actions Bot left a comment

You can commit the suggested changes from lintrunner.

Comment thread onnxruntime/core/providers/cuda/reduction/reduction_ops.cc Outdated
@tianleiwu tianleiwu requested a review from Copilot May 5, 2026 19:36
@tianleiwu tianleiwu marked this pull request as ready for review May 5, 2026 19:37
@tianleiwu tianleiwu requested a review from justinchuby May 5, 2026 19:38
@tianleiwu tianleiwu requested a review from titaiwangms May 5, 2026 19:39

Copilot AI left a comment

Pull request overview

Extends CUDA ONNX-domain ReduceMax/ReduceMin registrations to cover the opset 19-20 gap by splitting the old opset-18+ registrations into 1-17, 18-19, and 20+. This fits the broader CUDA provider work to keep kernel registry coverage aligned with newer ONNX opsets.

Changes:

  • Added a versioned CUDA reduction registration macro that preserves axes as a CPU input for ranged opset registrations.
  • Split ReduceMax and ReduceMin CUDA kernel declarations/registrations into 18-19 and 20+ entries in the CUDA provider.
  • Added CUDA-focused opset-20 reduction tests for multiple data types and keepdims settings.
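
As a reference for what such tests check, the following sketch computes ReduceMax over the last axis of a row-major 2-D tensor and reports the output shape with and without keepdims. It is a plain CPU illustration, not the test code from this PR; `ReducedShape`, `ShapeAfterReduceAxis1`, and `ReduceMaxAxis1` are hypothetical names, and only the 2-D, single-axis case is handled.

```cpp
#include <array>
#include <cstddef>

// Output shape of reducing a [rows, cols] tensor over axis 1.
// rank is 2 when keepdims is set ([rows, 1]) and 1 otherwise ([rows]).
struct ReducedShape {
  std::size_t rank;
  std::array<std::size_t, 2> dims;  // only the first `rank` entries are meaningful
};

constexpr ReducedShape ShapeAfterReduceAxis1(std::size_t rows, bool keepdims) {
  return keepdims ? ReducedShape{2, {rows, 1}} : ReducedShape{1, {rows, 0}};
}

// Reference ReduceMax over axis 1 of a row-major [R, C] tensor.
// T is any type with operator>, e.g. float, int32_t, or int8_t.
template <typename T, std::size_t R, std::size_t C>
constexpr std::array<T, R> ReduceMaxAxis1(const std::array<T, R * C>& data) {
  static_assert(C > 0, "this sketch does not handle an empty reduction axis");
  std::array<T, R> out{};
  for (std::size_t r = 0; r < R; ++r) {
    T best = data[r * C];  // first element of row r
    for (std::size_t c = 1; c < C; ++c) {
      if (data[r * C + c] > best) best = data[r * C + c];
    }
    out[r] = best;
  }
  return out;
}
```

For a 2×3 input `{1, 5, 3, -2, -7, -4}`, `ReduceMaxAxis1<float, 2, 3>` yields `{5, -2}`; the values are the same for both keepdims settings and only the reported shape differs ([2, 1] versus [2]).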

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| onnxruntime/test/providers/cpu/reduction/reduction_ops_test.cc | Adds CUDA-oriented opset-20 tests for ReduceMax/ReduceMin. |
| onnxruntime/core/providers/cuda/reduction/reduction_ops.cc | Splits reduction kernel registrations by opset range and adds a helper macro for ranged registrations with CPU axes input. |
| onnxruntime/core/providers/cuda/cuda_execution_provider.cc | Updates CUDA EP forward declarations and kernel registration table entries for the new opset ranges. |

Comment thread onnxruntime/core/providers/cuda/cuda_execution_provider.cc
Comment thread onnxruntime/test/providers/cpu/reduction/reduction_ops_test.cc
…able

Add forward declarations and BuildKernelCreateInfo entries for
ReduceMax with int8_t and uint8_t types at opsets 1-17, 18-19, and 20
in the CUDA execution provider registration table. These were already
registered in reduction_ops.cc but missing from the provider table,
which would cause kernel lookup failures for 8-bit ReduceMax models.
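
The failure mode this commit describes, a kernel that is defined but never listed in the provider table, can be sketched in isolation. The code below is a hypothetical model rather than the ONNX Runtime macros: `KernelInfo`, `BuildInfo`, the tag types, and `Find` are illustrative names only.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a kernel description.
struct KernelInfo {
  std::string_view op;
  std::string_view type;
};

// Step 1: each kernel gets a builder next to its implementation
// (the role reduction_ops.cc plays in the PR).
struct ReduceMax_float {};
struct ReduceMax_int8 {};

template <typename Tag>
constexpr KernelInfo BuildInfo();

template <>
constexpr KernelInfo BuildInfo<ReduceMax_float>() {
  return {"ReduceMax", "float"};
}

template <>
constexpr KernelInfo BuildInfo<ReduceMax_int8>() {
  return {"ReduceMax", "int8_t"};
}

// Step 2: the provider exposes only the builders listed in its table
// (the role cuda_execution_provider.cc plays in the PR).
using BuildFn = KernelInfo (*)();

template <std::size_t N>
constexpr bool Find(const std::array<BuildFn, N>& table, std::string_view op,
                    std::string_view type) {
  for (std::size_t i = 0; i < N; ++i) {
    const KernelInfo info = table[i]();
    if (info.op == op && info.type == type) return true;
  }
  return false;
}

// The int8_t builder exists but is not listed, so lookups cannot see it.
constexpr std::array<BuildFn, 1> kIncompleteTable{{&BuildInfo<ReduceMax_float>}};
// Listing the builder makes the kernel discoverable.
constexpr std::array<BuildFn, 2> kCompleteTable{
    {&BuildInfo<ReduceMax_float>, &BuildInfo<ReduceMax_int8>}};
```

In this model, `Find(kIncompleteTable, "ReduceMax", "int8_t")` is false even though the builder compiles, which mirrors why the missing table entries would surface only as lookup failures at run time.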

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread onnxruntime/test/providers/cpu/reduction/reduction_ops_test.cc
@tianleiwu tianleiwu enabled auto-merge (squash) May 5, 2026 22:00
@tianleiwu tianleiwu merged commit 28bcc9c into main May 5, 2026
91 of 93 checks passed
@tianleiwu tianleiwu deleted the copilot/update-onnx-reduce-operators branch May 5, 2026 23:04

4 participants